dgit.raspbian.org Git

tools/ocaml/xenstored: Fix path length validation

Currently, oxenstored checks the length of paths against 1024, then
prepends "/local/domain/$DOMID/" to relative paths.  This allows a domU
to create paths which can't subsequently be read by anyone, even dom0.
This also interferes with listing directories, etc.

Define a new oxenstored.conf entry: quota-path-max, defaulting to 1024
as before.  For paths that begin with "/local/domain/$DOMID/" check the
relative path length against this quota. For all other paths check the
entire path length.

This ensures that if the domid changes (and thus the length of a prefix
changes) a path that used to be valid stays valid (e.g. after a
live-migration).  It also ensures that regardless how the client tries
to access a path (domid-relative or absolute) it will get consistent
results, since the limit is always applied on the final canonicalized
path.

Delete the unused Domain.get_path to avoid it being confused with
Connection.get_path (which differs by a trailing slash only).

Rewrite Util.path_validate to apply the appropriate length restriction
based on whether the path is relative or not.  Remove the check for
connection_path being absolute, because it is not guest controlled data.

This is part of XSA-323.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>

tools/ocaml/xenstored: clean up permissions for dead domains

domain ids are prone to wrapping (15-bits), and with sufficient number
of VMs in a reboot loop it is possible to trigger it.  Xenstore entries
may linger after a domain dies, until a toolstack cleans it up. During
this time there is a window where a wrapped domid could access these
xenstore keys (that belonged to another VM).

To prevent this do a cleanup when a domain dies:
* walk the entire xenstore tree and update permissions for all nodes
   * if the dead domain had an ACL entry: remove it
   * if the dead domain was the owner: change the owner to Dom0

This is done without quota checks or a transaction. Quota checks would
be a no-op (either the domain is dead, or it is Dom0 where they are not
enforced).  Transactions are not needed, because this is all done
atomically by oxenstored's single thread.

The xenstore entries owned by the dead domain are not deleted, because
that could confuse a toolstack / backends that are still bound to it
(or generate unexpected watch events). It is the responsibility of a
toolstack to remove the xenstore entries themselves.

This is part of XSA-322.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>

tools/xenstore: revoke access rights for removed domains

Access rights of Xenstore nodes are per domid. Unfortunately existing
granted access rights are not removed when a domain is being destroyed.
This means that a new domain created with the same domid will inherit
the access rights to Xenstore nodes from the previous domain(s) with
the same domid.

This can be avoided by adding a generation counter to each domain.
The generation counter of the domain is set to the global generation
counter when a domain structure is being allocated. When reading or
writing a node all permissions of domains which are younger than the
node itself are dropped. This is done by flagging the related entry
as invalid in order to avoid modifying permissions in a way the user
could detect.

A special case has to be considered: for a new domain the first
Xenstore entries are already written before the domain is officially
introduced in Xenstore. In order not to drop the permissions for the
new domain a domain struct is allocated even before introduction if
the hypervisor is aware of the domain. This requires adding another
bool "introduced" to struct domain in xenstored. In order to avoid
additional padding holes convert the shutdown flag to bool, too.

As verifying permissions has its price regarding runtime add a new
quota for limiting the number of permissions an unprivileged domain
can set for a node. The default for that new quota is 5.

This is part of XSA-322.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <julien@amazon.com>

tools/ocaml/xenstored: add xenstored.conf flag to turn off watch permission checks

There are flags to turn off quotas and the permission system, so add one
that turns off the newly introduced watch permission checks as well.

This is part of XSA-115.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

tools/ocaml/xenstored: avoid watch events for nodes without access

Today watch events are sent regardless of the access rights of the
node the event is sent for. This enables any guest to e.g. setup a
watch for "/" in order to have a detailed record of all Xenstore
modifications.

Modify that by sending only watch events for nodes that the watcher
has a chance to see otherwise (either via direct reads or by querying
the children of a node). This includes cases where the visibility of
a node for a watcher is changing (permissions being removed).

Permissions for nodes are looked up either in the old (pre
transaction/command) or current trees (post transaction).  If
permissions are changed multiple times in a transaction only the final
version is checked, because considering a transaction atomic the
individual permission changes would not be noticable to an outside
observer.

Two trees are only needed for set_perms: here we can either notice the
node disappearing (if we loose permission), appearing
(if we gain permission), or changing (if we preserve permission).

RM needs to only look at the old tree: in the new tree the node would be
gone, or could have different permissions if it was recreated (the
recreation would get its own watch fired).

Inside a tree we lookup the watch path's parent, and then the watch path
child itself.  This gets us 4 sets of permissions in worst case, and if
either of these allows a watch, then we permit it to fire.  The
permission lookups are done without logging the failures, otherwise we'd
get confusing errors about permission denied for some paths, but a watch
still firing. The actual result is logged in xenstored-access log:

  'w event ...' as usual if watch was fired
  'w notfired...' if the watch was not fired, together with path and
  permission set to help in troubleshooting

Adding a watch bypasses permission checks and always fires the watch
once immediately. This is consistent with the specification, and no
information is gained (the watch is fired both if the path exists or
doesn't, and both if you have or don't have access, i.e. it reflects the
path a domain gave it back to that domain).

There are some semantic changes here:

  * Write+rm in a single transaction of the same path is unobservable
    now via watches: both before and after a transaction the path
    doesn't exist, thus both tree lookups come up with the empty
    permission set, and noone, not even Dom0 can see this. This is
    consistent with transaction atomicity though.
  * Similar to above if we temporarily grant and then revoke permission
    on a path any watches fired inbetween are ignored as well
  * There is a new log event (w notfired) which shows the permission set
    of the path, and the path.
  * Watches on paths that a domain doesn't have access to are now not
    seen, which is the purpose of the security fix.

This is part of XSA-115.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

tools/ocaml/xenstored: introduce permissions for special watches

The special watches "@introduceDomain" and "@releaseDomain" should be
allowed for privileged callers only, as they allow to gain information
about presence of other guests on the host. So send watch events for
those watches via privileged connections only.

Start to address this by treating the special watches as regular nodes
in the tree, which gives them normal semantics for permissions. A later
change will restrict the handling, so that they can't be listed, etc.

This is part of XSA-115.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

tools/ocaml/xenstored: unify watch firing

This will make it easier insert additional checks in a follow-up patch.
All watches are now fired from a single function.

This is part of XSA-115.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

tools/ocaml/xenstored: check privilege for XS_IS_DOMAIN_INTRODUCED

The Xenstore command XS_IS_DOMAIN_INTRODUCED should be possible for privileged
domains only (the only user in the tree is the xenpaging daemon).

This is part of XSA-115.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

tools/ocaml/xenstored: ignore transaction id for [un]watch

Instead of ignoring the transaction id for XS_WATCH and XS_UNWATCH
commands as it is documented in docs/misc/xenstore.txt, it is tested
for validity today.

Really ignore the transaction id for XS_WATCH and XS_UNWATCH.

This is part of XSA-115.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

tools/xenstore: avoid watch events for nodes without access

Today watch events are sent regardless of the access rights of the
node the event is sent for. This enables any guest to e.g. setup a
watch for "/" in order to have a detailed record of all Xenstore
modifications.

Modify that by sending only watch events for nodes that the watcher
has a chance to see otherwise (either via direct reads or by querying
the children of a node). This includes cases where the visibility of
a node for a watcher is changing (permissions being removed).

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: allow special watches for privileged callers only

The special watches "@introduceDomain" and "@releaseDomain" should be
allowed for privileged callers only, as they allow to gain information
about presence of other guests on the host. So send watch events for
those watches via privileged connections only.

In order to allow for disaggregated setups where e.g. driver domains
need to make use of those special watches add support for calling
"set permissions" for those special nodes, too.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: introduce node_perms structure

There are several places in xenstored using a permission array and the
size of that array. Introduce a new struct node_perms containing both.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: fire watches only when removing a specific node

Instead of firing all watches for removing a subtree in one go, do so
only when the related node is being removed.

The watches for the top-most node being removed include all watches
including that node, while watches for nodes below that are only fired
if they are matching exactly. This avoids firing any watch more than
once when removing a subtree.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: rework node removal

Today a Xenstore node is being removed by deleting it from the parent
first and then deleting itself and all its children. This results in
stale entries remaining in the data base in case e.g. a memory
allocation is failing during processing. This would result in the
rather strange behavior to be able to read a node (as its still in the
data base) while not being visible in the tree view of Xenstore.

Fix that by deleting the nodes from the leaf side instead of starting
at the root.

As fire_watches() is now called from _rm() the ctx parameter needs a
const attribute.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: check privilege for XS_IS_DOMAIN_INTRODUCED

The Xenstore command XS_IS_DOMAIN_INTRODUCED should be possible for
privileged domains only (the only user in the tree is the xenpaging
daemon).

Instead of having the privilege test for each command introduce a
per-command flag for that purpose.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: simplify and rename check_event_node()

There is no path which allows to call check_event_node() without a
event name. So don't let the result depend on the name being NULL and
add an assert() covering that case.

Rename the function to check_special_event() to better match the
semantics.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: fix node accounting after failed node creation

When a node creation fails the number of nodes of the domain should be
the same as before the failed node creation. In case of failure when
trying to create a node requiring to create one or more intermediate
nodes as well (e.g. when /a/b/c/d is to be created, but /a/b isn't
existing yet) it might happen that the number of nodes of the creating
domain is not reset to the value it had before.

So move the quota accounting out of construct_node() and into the node
write loop in create_node() in order to be able to undo the accounting
in case of an error in the intermediate node destructor.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>

tools/xenstore: ignore transaction id for [un]watch

Instead of ignoring the transaction id for XS_WATCH and XS_UNWATCH
commands as it is documented in docs/misc/xenstore.txt, it is tested
for validity today.

Really ignore the transaction id for XS_WATCH and XS_UNWATCH.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/xenstore: allow removing child of a node exceeding quota

An unprivileged user of Xenstore is not allowed to write nodes with a
size exceeding a global quota, while privileged users like dom0 are
allowed to write such nodes. The size of a node is the needed space
to store all node specific data, this includes the names of all
children of the node.

When deleting a node its parent has to be modified by removing the
name of the to be deleted child from it.

This results in the strange situation that an unprivileged owner of a
node might not succeed in deleting that node in case its parent is
exceeding the quota of that unprivileged user (it might have been
written by dom0), as the user is not allowed to write the updated
parent node.

Fix that by not checking the quota when writing a node for the
purpose of removing a child's name only.

The same applies to transaction handling: a node being read during a
transaction is written to the transaction specific area and it should
not be tested for exceeding the quota, as it might not be owned by
the reader and presumably the original write would have failed if the
node is owned by the reader.

This is part of XSA-115.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>

tools/ocaml/xenstored: do permission checks on xenstore root

This was lacking in a disappointing number of places.

The xenstore root node is treated differently from all other nodes, because it
doesn't have a parent, and mutation requires changing the parent.

Unfortunately this lead to open-coding the special case for root into every
single xenstore operation, and out of all the xenstore operations only read
did a permission check when handling the root node.

This means that an unprivileged guest can:

* xenstore-chmod / to its liking and subsequently write new arbitrary nodes
   there (subject to quota)
* xenstore-rm -r / deletes almost the entire xenstore tree (xenopsd quickly
   refills some, but you are left with a broken system)
* DIRECTORY on / lists all children when called through python
   bindings (xenstore-ls stops at /local because it tries to list recursively)
* get-perms on / works too, but that is just a minor information leak

Add the missing permission checks, but this should really be refactored to do
the root handling and permission checks on the node only once from a single
function, instead of getting it wrong nearly everywhere.

This is XSA-353.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

d/changelog: most stuff for 4.14.0+88-g1d1d1f5391-1

Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

debian/changelog: add a few missing CVE numbers

Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

debian/rules: enable verbose build

Debian policy says the package build should be as verbose as reasonably
possible. Also this helps with debugging build issues.

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Reviewed-by: Hans van Kranenburg <hans@knorrie.org>

d/xen-hypervisor-V.bug-control.vsn-in: actually install script

Rename debian/xen-hypervisor-V.bug-control.vsn-in to
debian/xen-hypervisor-V-F.bug-control.vsn-in so it actually gets
installed.

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
[Hans van Kranenburg: split submitted patch in two]
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

d/xen-hypervisor-V.*: clean up unused files

These files are unused identical copies of the -V-F. files.

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
[Hans van Kranenburg: split submitted patch in two]
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

d/xen-hypervisor-V-F.postrm: actually install script

Fix the filename of debian/xen-hypervisor-V-F.postrm by adding the
missing .vsn-in suffix so it actually gets installed.

This fixes the issue of not running update-grub when the Xen hypervisor
packages get removed, which resulted in a system that could not be
rebooted without interactive usage of the grub menu.

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Reviewed-by: Hans van Kranenburg <hans@knorrie.org>

d/xen-hypervisor-V-F.postinst.vsn-in: use reboot-required

When updating the xen hypervisor, touch the /run/reboot-required file to
indicate a reboot is required. Also add information about the package
requesting the reboot to the /run/reboot-required.pkgs file.

Closes: #862408
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Reviewed-by: Hans van Kranenburg <hans@knorrie.org>

fix spelling errors

Only spelling errors; no functional changes.

In docs/misc/dump-core-format.txt there are a few more instances of
'informations'. I'll leave that up to someone who can properly determine
how those sentences should be constructed.

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit ba6e78f0db820fbeea4df41fde4655020ca05928)

d/shuffle-binaries: Remove useless extra argument being passed in

This doesn't do anything. Really it is almost a bug the script /doesn't/
complain.

Signed-off-by: Elliott Mitchell <ehem+debian@m5p.com>
Acked-by: Hans van Kranenburg <hans@knorrie.org>

debian/xen.init: Load xen_acpi_processor on boot

This allows more control of processor state, potentially resulting in
reduced power usage. Alternatively simply more information on processor
use.

Signed-off-by: Elliott Mitchell <ehem+debian@m5p.com>
Acked-by: Hans van Kranenburg <hans@knorrie.org>

debian/rules: Set CC/LD to enable cross-building

Always set $(CC) and $(LD) when building. This has no effect on native
builds, but is needed for cross-builds to use the right compiler.

Signed-off-by: Elliott Mitchell <ehem+debian@m5p.com>
Acked-by: Hans van Kranenburg <hans@knorrie.org>

d/shuffle-boot-files: The Great Quotification

These should originate with the owner of a build system and are unlikely
to get hazardous values. This script though *should* work on a system
with such a bizzare setup. On general principle, add lots of double-quotes.

Signed-off-by: Elliott Mitchell <ehem+debian@m5p.com>
Acked-by: Hans van Kranenburg <hans@knorrie.org>

d/shuffle-binaries: Add quoting for potentially changeable variables

These should originate with the owner of a build system and are unlikely
to get hazardous values. This script though *should* work on a system
with such a bizarre setup. On general principle, add double-quotes.

Signed-off-by: Elliott Mitchell <ehem+debian@m5p.com>
Acked-by: Hans van Kranenburg <hans@knorrie.org>

d/shuffle-binaries: Make error detection/message overt

The reason for the `ls` at the end is pretty straightforward if you think
about it, mainly confirming the shuffle step did something. Yet that
is a bit non-obvious, and the error message produced on failure is poor
("No such file or directory" simply means the script failed somewhere?).
Add an overt error message.

Signed-off-by: Elliott Mitchell <ehem+debian@m5p.com>
Acked-by: Hans van Kranenburg <hans@knorrie.org>

xen/arm: traps: Don't panic when receiving an unknown debug trap

Even if debug trap are only meant for debugging purpose, it is quite
harsh to crash Xen if one of the trap sent by the guest is not handled.

So switch from a panic() to a printk().

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 957708c2d1ae25d7375abd5e5e70c3043d64f1f1)

xen/arm: acpi: add BAD_MADT_GICC_ENTRY() macro

Imported from Linux commit b6cfb277378ef831c0fa84bcff5049307294adc6:

    The BAD_MADT_ENTRY() macro is designed to work for all of the subtables
    of the MADT.  In the ACPI 5.1 version of the spec, the struct for the
    GICC subtable (struct acpi_madt_generic_interrupt) is 76 bytes long; in
    ACPI 6.0, the struct is 80 bytes long.  But, there is only one definition
    in ACPICA for this struct -- and that is the 6.0 version.  Hence, when
    BAD_MADT_ENTRY() compares the struct size to the length in the GICC
    subtable, it fails if 5.1 structs are in use, and there are systems in
    the wild that have them.

    This patch adds the BAD_MADT_GICC_ENTRY() that checks the GICC subtable
    only, accounting for the difference in specification versions that are
    possible.  The BAD_MADT_ENTRY() will continue to work as is for all other
    MADT subtables.

    This code is being added to an arm64 header file since that is currently
    the only architecture using the GICC subtable of the MADT.  As a GIC is
    specific to ARM, it is also unlikely the subtable will be used elsewhere.

Fixes: aeb823bbacc2 ("ACPICA: ACPI 6.0: Add changes for FADT table.")
Signed-off-by: Al Stone <al.stone@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
    [catalin.marinas@arm.com: extra brackets around macro arguments]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Elliott Mitchell <ehem+xen@m5p.com>
(cherry picked from commit 7056f2f89f03f2f804ac7e776c7b2b000cd716cd)

xen/arm: Introduce fw_unreserved_regions() and use it

Since commit 6e3e77120378 "xen/arm: setup: Relocate the Device-Tree
later on in the boot", the device-tree will not be kept mapped when
using ACPI.

However, a few places are calling dt_unreserved_regions() which expects
a valid DT. This will lead to a crash.

As the DT should not be used for ACPI (other than for detecting the
modules), a new function fw_unreserved_regions() is introduced.

It will behave the same way on DT system. On ACPI system, it will
unreserve the whole region.

Take the opportunity to clarify that bootinfo.reserved_mem is only used
when booting using Device-Tree.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 9c2bc0f24b2ba7082df408b3c33ec9a86bf20cf0)

xen/arm: Check if the platform is not using ACPI before initializing Dom0less

Dom0less requires a device-tree. However, since commit 6e3e77120378
"xen/arm: setup: Relocate the Device-Tree later on in the boot", the
device-tree will not get unflatten when using ACPI.

This will lead to a crash during boot.

Given the complexity to setup dom0less with ACPI (for instance how to
assign device?), we should skip any code related to Dom0less when using
ACPI.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Elliott Mitchell <ehem+xen@m5p.com>
(cherry picked from commit dac867bf9adc1562a4cf9db5f89726597af13ef8)

xen/arm: acpi: The fixmap area should always be cleared during failure/unmap

Commit 022387ee1ad3 "xen/arm: mm: Don't open-code Xen PT update in
{set, clear}_fixmap()" enforced that each set_fixmap() should be
paired with a clear_fixmap(). Any failure to follow the model would
result to a platform crash.

Unfortunately, the use of fixmap in the ACPI code was overlooked as it
is calling set_fixmap() but not clear_fixmap().

The function __acpi_os_map_table() is reworked so:
    - We know before the mapping whether the fixmap region is big
    enough for the mapping.
    - It will fail if the fixmap is already in use. This is not a
    change of behavior but clarifying the current expectation to avoid
    hitting a BUG().

The function __acpi_os_unmap_table() will now call clear_fixmap().

Reported-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 4d625ff3c3a939dc270b03654337568c30c5ab6e)

xen/acpi: Rework acpi_os_map_memory() and acpi_os_unmap_memory()

The functions acpi_os_{un,}map_memory() are meant to be arch-agnostic
while the __acpi_os_{un,}map_memory() are meant to be arch-specific.

Currently, the former are still containing x86 specific code.

To avoid this rather strange split, the generic helpers are reworked so
they are arch-agnostic. This requires the introduction of a new helper
__acpi_os_unmap_memory() that will undo any mapping done by
__acpi_os_map_memory().

Currently, the arch-helper for unmap is basically a no-op so it only
returns whether the mapping was arch specific. But this will change
in the future.

Note that the x86 version of acpi_os_map_memory() was already able to
able the 1MB region. Hence why there is no addition of new code.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Elliott Mitchell <ehem+xen@m5p.com>
(cherry picked from commit 1c4aa69ca1e1fad20b2158051eb152276d1eb973)

xen/arm: acpi: Don't fail if SPCR table is absent

Absence of a SPCR table likely means the console is a framebuffer. In
such case acpi_iomem_deny_access() should NOT fail.

Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 861f0c110976fa8879b7bf63d9478b6be83d4ab6)

tools/python: Pass linker to Python build process

Unexpectedly the environment variable which needs to be passed is
$LDSHARED and not $LD. Otherwise Python may find the build `ld` instead
of the host `ld`.

Replace $(LDFLAGS) with $(SHLIB_LDFLAGS) as Python needs shared objects
it can load at runtime, not executables.

This uses $(CC) instead of $(LD) since Python distutils appends $CFLAGS
to $LDFLAGS which breaks many linkers.

Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
(cherry picked from commit 17d192e0238d6c714e9f04593b59597b7090be38)

[ Hans van Kranenburg ]
Fixed cherry-pick conflict because we have LIBEXEC_LIB=$(LIBEXEC_LIB) in
between in the same lines. The line wrap mess makes it a bit hard to
follow.

xen/rpi4: implement watchdog-based reset

The preferred method to reboot RPi4 is PSCI. If it is not available,
touching the watchdog is required to be able to reboot the board.

The implementation is based on
drivers/watchdog/bcm2835_wdt.c:__bcm2835_restart in Linux v5.9-rc7.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Tested-by: Roman Shaposhnik <roman@zededa.com>
CC: roman@zededa.com
(cherry picked from commit 25849c8b16f2a5b7fcd0a823e80a5f1b590291f9)

d/rules: do not compress /usr/share/doc/xen/html

So, we have a xen-doc package which is already split off nicely as
binary package to contain the upstream html collection of documentation.

By default, dh_compress will gzip -9 .txt files which it considers to
be too large. However, in our case, there's some index.html with links,
and if this happens, the user who consciously installs the xen-doc
package, wanting to waste disk space on this ends up with a confusing
pile of broken links when navigating to file:///usr/share/doc/xen/html/
and browsing around.

The difference between disk space occupied when not compressing is 5.2M
vs. 5.1M before.

So, add an exclude on /usr/share/doc/xen/html

Closes: #942611
Reported-by: Diederik de Haas <didi.debian@cknow.org>
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

t/h/L/vif-common.sh: fix handle_iptable return value

A return statement without explicit value will return the value of the
last command executed before this line with return was encountered.

This is not what we want. return 0.

Closes: #955994
Fixes: 2e0814f971dd ("vif-common: disable handle_iptable")
Reported-by: Samuel Thibault <sthibault@debian.org>
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

tools: Partially revert "Cross-compilation fixes."

This partially reverts commit 16504669c5cbb8b195d20412aadc838da5c428f7.

Doesn't look like much of 16504669c5cbb8b195d20412aadc838da5c428f7
actually remains due to passage of time.

Of the 3, both Python and pygrub appear to mostly be building just fine
cross-compiling. The OCAML portion is being troublesome, this is going
to cause bug reports elsewhere soon. The OCAML portion though can
already be disabled by setting OCAML_TOOLS=n and shouldn't have this
extra form of disabling.

Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
(cherry picked from commit 69953e2856382274749b617125cc98ce38198463)

tools: don't build/ship xenmon

It can't run with Python 3, and I'm not going to fix it.

Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

tools/xl/bash-completion: also complete 'xen'

We have the `xen` alias for xl in Debian, since in the past it was a
command that could execute either xl or xm.

Now, it always does xl, so, complete the same stuff for it as we have
for xl.

Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
[git-debrebase split: mixed commit: upstream part]

pygrub: Specify -rpath LIBEXEC_LIB when building fsimage.so

If LIBEXEC_LIB is not on the default linker search path, the python
fsimage.so module fails to find libfsimage.so.

Add the relevant directory to the rpath explicitly.

(This situation occurs in the Debian package, where
--with-libexec-libdir is used to put each Xen version's libraries and
utilities in their own directory, to allow them to be coinstalled.)

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

pygrub: Set sys.path

We install libfsimage in a non-standard path for Reasons.
(See debian/rules.)

This patch was originally part of `tools-pygrub-prefix.diff'
(eg commit 51657319be54) and included changes to the Makefile to
change the installation arrangements (we do that part in the rules now
since that is a lot less prone to conflicts when we update) and to
shared library rpath (which is now done in a separate patch).

(Commit message rewritten by Ian Jackson.)

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
squash! pygrub: Set sys.path and rpath

hotplug-common: Do not adjust LD_LIBRARY_PATH

This is in the upstream script because on non-Debian systems, the
default install locations in /usr/local/lib might not be on the linker
path, and as a result the hotplug scripts would break.

A reason we might need it in Debian is our multiple version
coinstallation scheme. However, the hotplug scripts all call the
utilities via the wrappers, and the binaries are configured to load
from the right place anyway.

This setting is an annoyance because it requires libdir, which is an
arch-specific path but comes from a file we want to put in
xen-utils-common, an arch:all package.

So drop this setting.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

sysconfig.xencommons.in: Strip and debianize

Strip all options that are for stuff we don't ship, which is 1)
xenstored as stubdom and 2) xenbackendd, which seems to be dead code
anyway. [1]

It seems useful to give the user the option to revert to xenstored
instead of the default oxenstored if they really want.

[1] https://lists.xen.org/archives/html/xen-devel/2015-07/msg04427.html

Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Acked-by: Ian Jackson <ijackson@chiark.greenend.org.uk>

vif-common: disable handle_iptable

Also see Debian bug #894013. The current attempt at providing
anti-spoofing rules results in a situation that does not have any
effect. Also note that forwarding bridged traffic to iptables is not
enabled by default, and that for openvswitch users it does not make any
sense.

So, stop cluttering the live iptables ruleset.

This functionality seems to be introduced before 2004 and since then it
has never got some additional love.

It would be nice to have a proper discussion upstream about how Xen
could provide some anti mac/ip spoofing in the dom0. It does not seem to
be a trivial thing to do, since it requires having quite some knowledge
about what the domU is allowed to do or not (e.g. a domU can be a
router...).

Fix empty fields in first hypervisor log line

Instead of:

    (XEN) Xen version 4.11.1 (Debian )
    (@)
    (gcc (Debian 8.2.0-13) 8.2.0) debug=n
    Thu Jan  3 19:08:37 UTC 2019

I'd like to see:

    (XEN) Xen version 4.11.1 (Debian 4.11.1-1~)
    (pkg-xen-devel@lists.alioth.debian.org)
    (gcc (Debian 8.2.0-13) 8.2.0) debug=n
    Thu Jan  3 22:44:00 CET 2019

The substitution was broken since the great packaging refactoring,
because the directory in which the build is done changed.

Also, use the Maintainer address from debian/control instead of the most
recent changelog entry. If someone wants to use the address to ask a
question, they will end up at the team mailing list, which is better
than an individual person.

docs/man/xen-vbd-interface.7: Provide properly-formatted NAME section

This manpage was omitted from
docs/man: Provide properly-formatted NAME sections
because I was previously building with markdown not installed.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

shim: Provide separate install-shim target

When building on a 32-bit userland, the user wants to build 32-bit
tools and a 64-bit hypervisor.  This involves setting XEN_TARGET_ARCH
to different values for the tools build and the hypervisor build.

So the user must invoke the tools build and the hypervisor build
separately.

However, although the shim is done by the tools/firmware Makefile, its
bitness needs to be the same as the hypervisor, not the same as the
tools.  When run with XEN_TARGET_ARCH=x86_32, it it skipped, which is
wrong.

So the user must invoke the shim build separately.  This can be done
with
   make -C tools/firmware/xen-dir XEN_TARGET_ARCH=x86_64

However, tools/firmware/xen-dir has no `install' target.  The
installation of all `firmware' is done in tools/firmware/Makefile.  It
might be possible to fix this, but it is not trivial.  For example,
the definitions of INST_DIR and DEBG_DIR would need to be copied, as
would an appropriate $(INSTALL_DIR) call.

For now, provide an `install-shim' target in tools/firmware/Makefile.

This has to be called from `install' of course.  We can't make it
a dependency of `install' because it might be run before `all' has
completed.  We could make it depend on a `shim' target but such
a target is nearly impossible to write because everything is done by
the inflexible subdir-$@ machinery.

The overally result of this patch is that existing make invocations
work as before.  But additionally, the user can say
  make -C tools/firmware install-shim XEN_TARGET_ARCH=x86_64
to install the shim.  The user must have built it already.
Unlike the build rune, this install-rune is properly conditional
so it is OK to call on ARM.

What a mess.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>

tools/firmware/Makefile: CONFIG_PV_SHIM: enable only on x86_64

Previously this was *dis*abled for x86_*32*. But if someone should
run some of this Makefile on ARM, say, it ought not to be built
either.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>

tools/firmware/Makfile: Respect caller's CONFIG_PV_SHIM

This makes it easier to disable the shim build. (In Debian we need to
build the shim separately because it needs different compiler flags
and a different XEN_COMPILE_ARCH.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>

Revert "pvshim: make PV shim build selectable from configure"

This reverts commit 8845155c831c59e867ee3dd31ee63e0cc6c7dcf2.

This upstream change changes stuff that breaks our very fragile mess
that builds the shim when it needs to, and doesn't when it should not.

The result is that it's missing in the end for the i386 build... :|

    dh_install: warning: Cannot find (any matches for)
    "usr/lib/debug/usr/lib/xen-*/boot/*" (tried in ., debian/tmp)

    dh_install: warning: xen-utils-4.14 missing files:
    usr/lib/debug/usr/lib/xen-*/boot/*
    dh_install: error: missing files, aborting

.gitignore: Add configure output which we always delete and regenerate

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

autoconf: Provide libexec_libdir_suffix

This is going to be used to put libfsimage.so into a path containing
the multiarch triplet.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

tools-libfsimage-prefix.diff

\o/

Do not build the instruction emulator

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

tools/tests/x86_emulator: Pass -no-pie -fno-pic to gcc on x86_32

The current build fails with GCC6 on Debian sid i386 (unstable):

/tmp/ccqjaueF.s: Assembler messages:
/tmp/ccqjaueF.s:3713: Error: missing or invalid displacement expression `vmovd_to_reg_len@GOT'

This is due to the combination of GCC6, and Debian's decision to
enable some hardening flags by default (to try to make runtime
addresses less predictable):
  https://wiki.debian.org/Hardening/PIEByDefaultTransition

This is of no benefit for the x86 instruction emulator test, which is
a rebuild of the emulator code for testing purposes only.  So pass
options to disable this.

These options will be no-ops if they are the same as the compiler
default.

On amd64, the -fno-pic breaks the build in a different way.  So do
this only on i386.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Gbp-Pq: Topic misc
Gbp-Pq: Name toolstestsx86_emulator-pass--no-pie--fno.patch

Remove static solaris support from pygrub

Patch-Name: tools-pygrub-remove-static-solaris-support

Gbp-Pq: Topic misc
Gbp-Pq: Name tools-pygrub-remove-static-solaris-support

Do not ship COPYING into /usr/include

This is not wanted in Debian. COPYING ends up in
/usr/share/doc/xen-*copyright.

Patch-Name: tools-include-no-COPYING.diff

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

config-prefix.diff

Patch-Name: config-prefix.diff

Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name config-prefix.diff

version

Delete configure output

These autogenerated files are not useful in Debian; dh_autoreconf will
regenerate them.

If this patch does not apply when rebasing, you can simply delete the
files again.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

Delete config.sub and config.guess

dh_autoreconf will provide these back.

If this patch does not apply when rebasing, you can simply delete the
files again.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>

Update changelog for new upstream 4.14.0+88-g1d1d1f5391

[git-debrebase changelog: new upstream 4.14.0+88-g1d1d1f5391]

Update to upstream 4.14.0+88-g1d1d1f5391

[git-debrebase anchor: new upstream 4.14.0+88-g1d1d1f5391, merge]

finalise changelog for unstable

Signed-off-by: Ian Jackson <iwj@xenproject.org>

changelog: document ~exp2

Signed-off-by: Ian Jackson <iwj@xenproject.org>

x86/vioapic: fix usage of index in place of GSI in vioapic_write_redirent

The usage of idx instead of the GSI in vioapic_write_redirent when
accessing gsi_assert_count can cause a PVH dom0 with multiple
vIO-APICs to lose interrupts in case a pin of a IO-APIC different than
the first one is unmasked with pending interrupts.

Switch to use gsi instead to fix the issue.

Fixes: 9f44b08f7d0e4 ('x86/vioapic: introduce support for multiple vIO APICS')
Reported-by: Manuel Bouyer <bouyer@antioche.eu.org>
Signed-off-by: Roger Pau Monné <roge.rpau@citrix.com>
Tested-by: Manuel Bouyer <bouyer@antioche.eu.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 3ae469af8e680df31eecd0a2ac6a83b58ad7ce53
master date: 2020-11-30 14:06:38 +0100

xen/events: rework fifo queue locking

Two cpus entering evtchn_fifo_set_pending() for the same event channel
can race in case the first one gets interrupted after setting
EVTCHN_FIFO_PENDING and when the other one manages to set
EVTCHN_FIFO_LINKED before the first one is testing that bit. This can
lead to evtchn_check_pollers() being called before the event is put
properly into the queue, resulting eventually in the guest not seeing
the event pending and thus blocking forever afterwards.

Note that commit 5f2df45ead7c1195 ("xen/evtchn: rework per event channel
lock") made the race just more obvious, while the fifo event channel
implementation had this race forever since the introduction and use of
per-channel locks, when an unmask operation was running in parallel with
an event channel send operation.

Using a spinlock for the per event channel lock had turned out
problematic due to some paths needing to take the lock are called with
interrupts off, so the lock would need to disable interrupts, which in
turn broke some use cases related to vm events.

For avoiding this race the queue locking in evtchn_fifo_set_pending()
needs to be reworked to cover the test of EVTCHN_FIFO_PENDING,
EVTCHN_FIFO_MASKED and EVTCHN_FIFO_LINKED, too. Additionally when an
event channel needs to change queues both queues need to be locked
initially, in order to avoid having a window with no lock held at all.

Reported-by: Jan Beulich <jbeulich@suse.com>
Fixes: 5f2df45ead7c1195 ("xen/evtchn: rework per event channel lock")
Fixes: de6acb78bf0e137c ("evtchn: use a per-event channel lock for sending events")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 71ac522909e9302350a88bc378be99affa87067c
master date: 2020-11-30 14:05:39 +0100

x86/DMI: fix SMBIOS pointer range check

Forever since its introduction this has been using an inverted relation
operator.

Fixes: 54057a28f22b ("x86: support SMBIOS v3")
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 6befe598706218673b14710d90d00ce90763b372
master date: 2020-11-24 11:25:29 +0100

xen/events: access last_priority and last_vcpu_id together

The queue for a fifo event is depending on the vcpu_id and the
priority of the event. When sending an event it might happen the
event needs to change queues and the old queue needs to be kept for
keeping the links between queue elements intact. For this purpose
the event channel contains last_priority and last_vcpu_id values
elements for being able to identify the old queue.

In order to avoid races always access last_priority and last_vcpu_id
with a single atomic operation avoiding any inconsistencies.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
master commit: 1277cb9dc5e966f1faf665bcded02b7533e38078
master date: 2020-11-24 11:23:42 +0100

x86/vpt: fix build with old gcc

I believe it was the XSA-336 fix (42fcdd42328f "x86/vpt: fix race when
migrating timers between vCPUs") which has unmasked a bogus
uninitialized variable warning. This is observable with gcc 4.3.4, but
only on 4.13 and older; it's hidden on newer versions apparently due to
the addition to _read_unlock() done by 12509bbeb9e3 ("rwlocks: call
preempt_disable() when taking a rwlock").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: f2c620aa062767b318267d678ae249dcb637b870
master date: 2020-11-18 12:38:01 +0100

xen/evtchn: revert 52e1fc47abc3a0123

With the event channel lock no longer disabling interrupts commit
52e1fc47abc3a0123 ("evtchn/Flask: pre-allocate node on send path") can
be reverted again.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit: b5ad37f8e9284cc147218f7a5193d739ae7b956f
master date: 2020-11-10 14:37:15 +0100

xen/evtchn: rework per event channel lock

Currently the lock for a single event channel needs to be taken with
interrupts off, which causes deadlocks in some cases.

Rework the per event channel lock to be non-blocking for the case of
sending an event and removing the need for disabling interrupts for
taking the lock.

The lock is needed for avoiding races between event channel state
changes (creation, closing, binding) against normal operations (set
pending, [un]masking, priority changes).

Use a rwlock, but with some restrictions:

- Changing the state of an event channel (creation, closing, binding)
  needs to use write_lock(), with ASSERT()ing that the lock is taken as
  writer only when the state of the event channel is either before or
  after the locked region appropriate (either free or unbound).

- Sending an event needs to use read_trylock() mostly, in case of not
  obtaining the lock the operation is omitted. This is needed as
  sending an event can happen with interrupts off (at least in some
  cases).

- Dumping the event channel state for debug purposes is using
  read_trylock(), too, in order to avoid blocking in case the lock is
  taken as writer for a long time.

- All other cases can use read_lock().

Fixes: e045199c7c9c54 ("evtchn: address races with evtchn_reset()")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
xen/events: fix build

Commit 5f2df45ead7c1195 ("xen/evtchn: rework per event channel lock")
introduced a build failure for NDEBUG builds.

Fixes: 5f2df45ead7c1195 ("xen/evtchn: rework per event channel lock")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: 5f2df45ead7c1195142f68b7923047a1e9479d54
master date: 2020-11-10 14:36:15 +0100
master commit: 53bacb86f496fdb11560d9e3b361bca7de60d268
master date: 2020-11-11 08:56:21 +0100

memory: fix off-by-one in XSA-346 change

The comparison against ARRAY_SIZE() needs to be >= in order to avoid
overrunning the pages[] array.

This is XSA-355.

Fixes: 5777a3742d88 ("IOMMU: hold page ref until after deferred TLB flush")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
master commit: 9b156bcc3ffcc7949edd4460b718a241e87ae302
master date: 2020-11-24 14:01:31 +0100

debian/changelog: finalise 4.14.0+80-gd101b417b7-1~exp1

debian/changelog: fix typo

Partially revert "debian/rules: Combine shared Make args"

This reverts part of commit af30268ed73397d24acaee83477ff555a43df5e4.

The change caused the following FTBFS on i386:

    gcc    -Wl,-z,relro -Wl,-z,now -pthread -Wl,-soname
    -Wl,libxentoolcore.so.1 -shared -Wl,--version-script=libxentoolcore.map
    -o libxentoolcore.so.1.0 handlereg.opic
    /usr/bin/ld: i386:x86-64 architecture of input file `handlereg.opic' is
    incompatible with i386 output
    /usr/bin/ld: handlereg.opic: file class ELFCLASS64 incompatible with
    ELFCLASS32
    /usr/bin/ld: final link failed: file in wrong format
    collect2: error: ld returned 1 exit status

"For $(make_args_xen), ${XEN_TARGET_ARCH} gets $(xen_arch_$(flavour)),
but for $(make_args_tools), ${XEN_TARGET_ARCH} gets
$(xen_arch_$(DEB_HOST_ARCH))."

Update changelog for new upstream 4.14.0+80-gd101b417b7

[git-debrebase changelog: new upstream 4.14.0+80-gd101b417b7]

Update to upstream 4.14.0+80-gd101b417b7

[git-debrebase anchor: new upstream 4.14.0+80-gd101b417b7, merge]

WIP debian/changelog: start 4.14.0-1~exp2

Signed-off-by: Hans van Kranenburg <hans@knorrie.org>

d/xen-utils-common.xen.init: disable oom killer for xenstored

In case of oom killer terminating some process, we'd rather not see
xenstored go. Xenstored has an in-memory database, and when starting the
process again, it would be empty, which is very inconvenient. Xenstored
should already score quite low and have a fairly low memory footprint,
but according to the user report, it happened.

Closes: #961511
Suggested-by: Samuel Thibault <sthibault@debian.org>
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Acked-by: Ian Jackson <ijackson@chiark.greenend.org.uk>

x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vcpu scheduling issues. The interface would want to be
paravirtualised to be used sensibly.

Disallow access to the RAPL MSRs completely, as well as other MSRs which
potentially access fine grain power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}

Currently a PV hardware domain can also be given control over the CPU
frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
However since commit 322ec7c89f6 the default behavior has been changed
to reject accesses to not explicitly handled MSRs, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guest will already be silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

xen/arm: Always trap AMU system registers

The Activity Monitors Unit (AMU) has been introduced by ARMv8.4. It is
considered to be unsafe to be expose to guests as they might expose
information about code executed by other guests or the host.

Arm provided a way to trap all the AMU system registers by setting
CPTR_EL2.TAM to 1.

Unfortunately, on older revision of the specification, the bit 30 (now
CPTR_EL1.TAM) was RES0. Because of that, Xen is setting it to 0 and
therefore the system registers would be exposed to the guest when it is
run on processors with AMU.

As the bit is mark as UNKNOWN at boot in Armv8.4, the only safe solution
for us is to always set CPTR_EL1.TAM to 1.

Guest trying to access the AMU system registers will now receive an
undefined instruction. Unfortunately, this means that even well-behaved
guest may fail to boot because we don't sanitize the ID registers.

This is a known issues with other Armv8.0+ features (e.g. SVE, Pointer
Auth). This will taken care separately.

This is part of XSA-351 (or XSA-93 re-born).

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
(cherry picked from commit 628e1becb6fb121475a6ce68e3f1cb4499851255)

tools/libs/stat: use memcpy instead of strncpy in getBridge

Use memcpy in getBridge to prevent gcc warnings about truncated
strings. We know that we might truncate it, so the gcc warning
here is wrong.
Revert previous change changing buffer sizes as bigger buffers
are not needed.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Wei Liu <wl@xen.org>
(cherry picked from commit 40fe714ca4245a9716264fcb3ee8df42c3650cf6)

tool/libs/light: Fix libxenlight gcc warning

Fix gcc10 compilation warning about uninitialized variable by setting
it to 0.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Wei Liu <wl@xen.org>
(cherry picked from commit 0241809bf838875615797f52af34222e5ab8e98f)

tools/libxc: report malloc errors in writev_exact

The caller of writev_exact should be notified about malloc errors
when dealing with partial writes.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
(cherry picked from commit 0d8d289af7a679c028462c4ed5d98586f9ef9648)

tools/libs/stat: fix broken build

Making getBridge() static triggered a build error with some gcc versions:

error: 'strncpy' output may be truncated copying 15 bytes from a string of
length 255 [-Werror=stringop-truncation]

Fix that by using a buffer with 256 bytes instead.

Fixes: 6d0ec053907794 ("tools: split libxenstat into new tools/libs/stat directory")
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
(cherry picked from commit c8099e48c3dbb8c7399551a265756ecf354eece2)

tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating

The do_ls() function has somewhat inconsistent handling of errors.

If reading the node's contents with xs_read() fails, then do_ls() will
just quietly not display the contents.

If reading the node's permissions with xs_get_permissions() fails, then
do_ls() will print a warning, continue, and ultimately won't exit with
an error code (unless another error happens).

If recursing into the node with xs_directory() fails, then do_ls() will
abort immediately, not printing any further nodes.

For persistent failure modes — such as ENOENT because a node has been
removed, or EACCES because it has had its permisions changed since the
xs_directory() on the parent directory returned its name — it's
obviously quite likely that if either of the first two errors occur for
a given node, then so will the third and thus xenstore-ls will abort.

The ENOENT one is actually a fairly common case, and has caused tools to
fail to clean up a network device because it *apparently* already
doesn't exist in xenstore.

There is a school of thought that says, "Well, xenstore-ls returned an
error. So the tools should not trust its output."

The natural corollary of this would surely be that the tools must re-run
xenstore-ls as many times as is necessary until its manages to exit
without hitting the race condition. I am not keen on that conclusion.

For the specific case of ENOENT it seems reasonable to declare that,
but for the timing, we might as well just not have seen that node at
all when calling xs_directory() for the parent. By ignoring the error,
we give acceptable output.

The issue can be reproduced as follows:

(dom0) # for a in `seq 1 1000` ; do
              xenstore-write /local/domain/2/foo/$a $a ;
         done

Now simultaneously:

(dom0) # for a in `seq 1 999` ; do
              xenstore-rm /local/domain/2/foo/$a ;
         done
(dom2) # while true ; do
              ./xenstore-ls -p /local/domain/2/foo | grep -c 1000 ;
         done

We should expect to see node 1000 in the output, every time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit beb105af19cc3e58e2ed49224cfe190a736e5fa0)

tools/xenpmd: Fix gcc10 snprintf warning

Add a check for snprintf return code and ignore the entry if we get an
error. This should in fact never happen and is more a trick to make gcc
happy and prevent compilation errors.

This is solving the following gcc warning when compiling for arm32 host
platforms with optimization activated:
xenpmd.c:92:37: error: '%s' directive output may be truncated writing
between 4 and 2147483645 bytes into a region of size 271
[-Werror=format-truncation=]

This is also solving the following Debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=970802

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Wei Liu <wl@xen.org>
(cherry picked from commit 0dfddb2116e3757f77a691a3fe335173088d69dc)

libxl: fix -Werror=stringop-truncation in libxl__prepare_sockaddr_un

In file included from /usr/include/string.h:495,
                 from libxl_internal.h:38,
                 from libxl_utils.c:20:
In function 'strncpy',
    inlined from 'libxl__prepare_sockaddr_un' at libxl_utils.c:1262:5:
/usr/include/bits/string_fortified.h:106:10: error: '__builtin_strncpy' specified bound 108 equals destination size [-Werror=stringop-truncation]
  106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit fff1b7f50e75ad9535c86f3fcf425b4945c50a1c)

libxl: workaround gcc 10.2 maybe-uninitialized warning

It seems xlu_pci_parse_bdf has a state machine that is too complex for
gcc to understand. The build fails with:

    libxlu_pci.c: In function 'xlu_pci_parse_bdf':
    libxlu_pci.c:32:18: error: 'func' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       32 |     pcidev->func = func;
          |     ~~~~~~~~~~~~~^~~~~~
    libxlu_pci.c:51:29: note: 'func' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |                             ^~~~
    libxlu_pci.c:31:17: error: 'dev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       31 |     pcidev->dev = dev;
          |     ~~~~~~~~~~~~^~~~~
    libxlu_pci.c:51:24: note: 'dev' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |                        ^~~
    libxlu_pci.c:30:17: error: 'bus' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       30 |     pcidev->bus = bus;
          |     ~~~~~~~~~~~~^~~~~
    libxlu_pci.c:51:19: note: 'bus' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |                   ^~~
    libxlu_pci.c:29:20: error: 'dom' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       29 |     pcidev->domain = domain;
          |     ~~~~~~~~~~~~~~~^~~~~~~~
    libxlu_pci.c:51:14: note: 'dom' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |              ^~~
    cc1: all warnings being treated as errors

Workaround it by setting the initial value to invalid value (0xffffffff)
and then assert on each value being set. This way we mute the gcc
warning, while still detecting bugs in the parse code.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit d25cc3ec93ebda030349045d2c7fa14ffde07ed7)